Abstract This paper assesses the use of single nucleotide
polymorphisms (SNPs) for forensic analysis. It demonstrates
that relatively small arrays of approx. 50 loci are
comparable to existing short tandem repeat (STR) multiplexes.
A quantitative test, however, is a prerequisite for
mixture interpretation. In addition, as the mixture proportion
becomes low, it will be necessary to distinguish between
the allele and background. Relatively small biallelic
arrays are also suitable to distinguish between closely
related individuals such as brothers.
Introduction

There is increasing interest in the use of biallelic markers
or single nucleotide polymorphisms (SNPs) for forensic
purposes (Syvanen et al. 1993). Several formats have
been used for PCR-based biallelic assays: the reverse dot
blot (Saiki et al. 1988) applied to HLA DQ-alpha and
Polymarker systems, microtitre-based formats (Kostyu et
al. 1993) and finally microfabricated arrays on glass
(Southern et al. 1992, 1994; Guo et al. 1994). The latter
are of special interest since the potential exists to build
arrays consisting of hundreds of loci. This paper specifically
explores the potential of biallelic arrays, particularly
with respect to the analysis of mixtures. All of the
platforms are non-electrophoretic.

A crucial aspect of forensic DNA typing is the interpretation
of mixtures (Evett et al. 1991; Weir et al. 1997).
Until recently, statistical interpretation of mixtures has
proceeded without considering differences in signal
strength of heterozygotes at a locus. Evett et al. (1998),
Clayton et al. (1998) and Gill et al. (1998) reported methods
to interpret mixed STR profiles based on identification
of the allele peak areas. Although intended for STR
(electrophoretic) analysis, the principles can be extended
to encompass biallelic loci on non-electrophoretic media.



How large does an array need to be? 

Typically an array of biallelics could comprise several
hundred loci that are typed from a single individual. In
this paper I consider relatively small arrays of 50-150
loci. Excluding the possibility of genetic 'nulls', a consideration
of each locus in turn must fall into one of two categories:
either one allele will be visible or two alleles
will be seen. The notation A, B is used to denote the two
alleles, where a and b denote their respective frequencies;
only AA, AB or BB are possible (because a + b = 1, all formulae
could be expressed solely in terms of a). If a microfabricated
array consists of n loci, the match probability
can be approximated by making the simplified assumption
that the frequency of A (a) is constant for every
locus in the array. For n different loci in an array, the number
of AA genotypes is $a^2 n$, the number of AB genotypes
is $2abn$ and the number of BB genotypes is $b^2 n$. For
example, if n = 100 and a = 0.5, then 50 loci will be
heterozygote and 50 will be homozygote (AA and BB in
equal proportion). Therefore, the match probability across
the entire array is $(a^2)^{25} \times (2ab)^{50} \times (b^2)^{25}$.
If estimated as a likelihood ratio ($LR_n$):

$$LR_n = \frac{1}{(a^2)^{a^2 n} \, (2ab)^{2abn} \, (b^2)^{b^2 n}}$$

Figure 1 shows simulations for arrays ranging from
50-150 loci. A relatively small array of 50 gives likelihood
ratios equivalent to approximately 12 STRs over a
wide range of a (0.2 < a < 0.8). Note that the plots in Fig. 1
are symmetrical, so that the $LR_n$ (a = 0.8) is the same as
the $LR_n$ (a = 0.2).
Fig. 1 Estimates of $LR_n$ from arrays of n loci, assuming fa is
constant across the set
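
The single-source array calculation can be illustrated numerically. The following Python sketch is not the simulation used for Fig. 1; it only assumes, as in the text, that the frequency a is the same at every locus and that each genotype occurs at its expected number of loci (which need not be a whole number). The function names are hypothetical, and logarithms are used so that very small match probabilities do not underflow.

```python
import math

def log10_match_probability(n, a):
    """log10 of the approximate match probability for an array of n biallelic loci."""
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    b = 1.0 - a
    genotype_freqs = [a * a, 2.0 * a * b, b * b]  # AA, AB, BB
    # A genotype with frequency f is expected at n * f loci,
    # and each such locus contributes a factor f to the match probability.
    return sum(n * f * math.log10(f) for f in genotype_freqs)

def log10_lr(n, a):
    """log10 of LR_n, taken as the reciprocal of the match probability."""
    return -log10_match_probability(n, a)

if __name__ == "__main__":
    for n in (50, 100, 150):
        print(n, [round(log10_lr(n, a), 1) for a in (0.2, 0.5, 0.8)])
```

For n = 100 and a = 0.5 this reproduces the worked example in the text, a match probability of $(a^2)^{25} \times (2ab)^{50} \times (b^2)^{25}$, and the values for a = 0.2 and a = 0.8 coincide because the expression is unchanged when a and b are exchanged.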
Analysis of mixtures

Assuming two contributors to the mixture, if one allele
shows then both must be homozygous for the same allele
(AA,AA or BB,BB).

If there are two alleles visible, and assuming that there
are two contributors to a mixture (suspect S and victim V,
respectively), then the following genotype combinations
are possible: AA,AB; AA,BB; AB,BB; AB,AB and all of the
reverse possibilities (Weir et al. 1997). Together with the
two single-allele combinations, this makes a total
of nine possible genotype combinations (m = 1...9), all of
which may be represented in a mixture. Given a normal
outbreeding population, the proportion of observations of
all of the above mixture types can be estimated given a.
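
As an illustration of how these proportions follow from a, the Python sketch below enumerates the nine ordered genotype combinations. It assumes Hardy-Weinberg proportions and two unrelated contributors, so that the proportion of each combination is the product of the two genotype frequencies; the function name is hypothetical.

```python
from itertools import product

def mixture_type_proportions(a):
    """List (first genotype, second genotype, visible alleles, proportion) for all nine combinations."""
    b = 1.0 - a
    genotype_freq = {"AA": a * a, "AB": 2.0 * a * b, "BB": b * b}
    table = []
    for (g1, p1), (g2, p2) in product(genotype_freq.items(), repeat=2):
        visible = "".join(sorted(set(g1 + g2)))  # "A", "B" or "AB"
        table.append((g1, g2, visible, p1 * p2))
    return table

if __name__ == "__main__":
    for g1, g2, visible, p in mixture_type_proportions(0.5):
        print(f"{g1},{g2}  shows {visible:<2}  proportion {p:.4f}")
```

Seven of the nine combinations show both alleles and two show a single allele; the nine proportions sum to one.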

Contributors to the mixture are the suspect 
and an unknown individual 

For example, suppose that a blood stain is retrieved from
a crime scene and the genotypes are consistent with a
combination of the suspect (S) with an unknown individual
(U).

We consider the following conditions in the likelihood
ratio:

C: Contributors were the suspect and an unknown individual

$\bar{C}$: Contributors were two unknown individuals

For each locus, calculation of the likelihood ratio depends
upon the genotype of the suspect and the alleles observed
in the mixture, and there are three broad categories to
consider.

Category 1

The suspect is homozygous (AA) and the mixture is AB.
Under C, (U) must be either AB or BB; under $\bar{C}$ (two
unknown contributors), any pair of genotypes that shows both
alleles is possible:

$C = 2ab + b^2$

$\bar{C} = 6a^2b^2 + 4a^3b + 4ab^3$

$LR = (2ab + b^2)/(6a^2b^2 + 4a^3b + 4ab^3)$

Category 2

The suspect is heterozygous (AB) and the profile is AB.
(U) must be AA, AB or BB and:

$C = (a + b)^2$

$\bar{C}$ is the same as in category 1 above.

Category 3

The suspect is homozygous (AA) and the profile shows
just one allele. (U) is AA and the LR is $1/a^2$.

A complete list of numerators and denominators is
given in Table 1. The proportion of an array of n loci having
a particular mixture type (m) is $f_m$. Each locus has mp
= 9 possible mixture genotype combinations (listed
in Table 1).

The total $LR_n$ of a mixture in an array of n loci is:

$$LR_n = \prod_{m=1}^{mp} LR_m^{\,n f_m}$$

where $LR_m$ is the likelihood ratio for mixture type m.

Simulation of typical (average) mixture statistics on the
combined $LR_n$ for any number of biallelic loci was carried
out under the simplified assumption that the allele proportion
(a) for each locus is the same across loci (Fig. 2).
$LR_n$ maximises when a is high (0.8) or low (0.2). A battery
of 50 loci with frequencies of alleles ranging between
0.1-0.9 will give a minimum LR of $10^4$.
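
The category formulae and their combination over an array can be checked numerically. The Python sketch below is illustrative only and is not the simulation behind Fig. 2. It assumes Hardy-Weinberg proportions with the same a at every locus, takes the cases in which the suspect is BB by symmetry (a and b exchanged), weights each of the nine suspect/unknown combinations by the product of the two genotype frequencies, and combines loci by raising each per-type LR to the expected number of loci of that type. The function names are hypothetical.

```python
import math

GENOTYPES = ("AA", "AB", "BB")

def genotype_freqs(a):
    """Hardy-Weinberg genotype frequencies for allele frequency a (assumed)."""
    b = 1.0 - a
    return {"AA": a * a, "AB": 2.0 * a * b, "BB": b * b}

def locus_lr(suspect, unknown, a):
    """Per-locus LR for a suspect genotype mixed with an unknown genotype."""
    b = 1.0 - a
    visible = set(suspect + unknown)
    if visible == {"A"}:          # category 3: only A is seen
        return 1.0 / a**2
    if visible == {"B"}:          # category 3 with a and b exchanged
        return 1.0 / b**2
    # Two unknown contributors showing both alleles (denominator of categories 1 and 2)
    denom = 6 * a**2 * b**2 + 4 * a**3 * b + 4 * a * b**3
    if suspect == "AA":           # category 1: unknown is AB or BB
        numer = 2 * a * b + b**2
    elif suspect == "BB":         # category 1 with a and b exchanged
        numer = 2 * a * b + a**2
    else:                         # category 2: unknown may be AA, AB or BB
        numer = (a + b) ** 2
    return numer / denom

def log10_lr_array(n, a):
    """log10 of the combined LR for n loci, each mixture type weighted by its expected count."""
    freqs = genotype_freqs(a)
    total = 0.0
    for s in GENOTYPES:
        for u in GENOTYPES:
            f_m = freqs[s] * freqs[u]  # expected proportion of loci of this type
            total += n * f_m * math.log10(locus_lr(s, u, a))
    return total

if __name__ == "__main__":
    for a in (0.2, 0.5, 0.8):
        print(a, round(log10_lr_array(50, a), 2))
```

Under these assumptions, 50 loci at a = 0.5 give a log10 value just under 4, and the value rises as a moves towards 0.2 or 0.8.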
Fig. 7 A comparison of likelihood ratios from arrays of 50, 100
and 150 loci, respectively, under the assumption that the suspect
and perpetrator are related. The allele proportion (fa) is 0.5. On
the X-axis: 1) full brothers, 2) father and son, 3) first cousins, 4)
unrelated

tion can still proceed provided that cumulative probability
functions can be used to estimate p(null). Interpretation of
more than two individuals contributing to a mixture will
present a major challenge. Independence assumptions
have not been assessed in this paper; however, it is inevitable
that due consideration will be needed with large
arrays.

Currently, the greatest problem in developing useful
SNP arrays for forensic use is not related to statistical
issues; rather, the problems are biochemical. Making a large
'balanced' multiplex of ca. 50 loci from less than 1 ng of
genomic template is indeed a daunting prospect.